AITopics | openai clip

openai clip

Meta CLIP 2: A Worldwide Scaling Recipe

Chuang, Yung-Sung, Li, Yang, Wang, Dong, Yeh, Ching-Feng, Lyu, Kehan, Raghavendra, Ramya, Glass, James, Huang, Lifei, Weston, Jason, Zettlemoyer, Luke, Chen, Xinlei, Liu, Zhuang, Xie, Saining, Yih, Wen-tau, Li, Shang-Wen, Xu, Hu

arXiv.org Artificial Intelligence, Aug-4-2025

Contrastive Language-Image Pretraining (CLIP) is a popular foundation model that supports tasks ranging from zero-shot classification and retrieval to serving as an encoder for multimodal large language models (MLLMs). Although CLIP has been trained successfully on billion-scale image-text pairs from the English-speaking world, scaling its training to worldwide web data remains challenging: (1) no curation method is available to handle data from the non-English world; (2) existing multilingual CLIP models perform worse in English than their English-only counterparts, the "curse of multilinguality" that is common in LLMs. Here, we present Meta CLIP 2, the first recipe for training CLIP from scratch on worldwide web-scale image-text pairs. To generalize our findings, we conduct rigorous ablations with the minimal changes necessary to address the above challenges and present a recipe that enables mutual benefits from English and non-English world data. In zero-shot ImageNet classification, Meta CLIP 2 ViT-H/14 surpasses its English-only counterpart by 0.8% and mSigLIP by 0.7%, and surprisingly sets a new state of the art without system-level confounding factors (e.g., translation, bespoke architecture changes) on multilingual benchmarks, such as CVQA with 57.4%, Babel-ImageNet with 50.2%, and XM3600 with 64.3% on image-to-text retrieval.

large language model, machine learning, natural language, (17 more...)

arXiv:2507.22062

Country:

North America (0.28)
Europe (0.28)
Asia (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

It Happened One Frame: incredibly accurate video content search with OpenAI CLIP

#artificialintelligence, Jun-16-2022, 16:10:23 GMT

I love movies, so as a fun exercise for my fast.ai ... It's named "It Happened One Frame", in tribute to the classic 1934 romantic comedy "It Happened One Night". To use this app, all you need is the link to a YouTube video. For example, you could search "Macaulay Culkin screams with hands on his cheeks" in a Home Alone movie clip and get the screenshots that capture the most iconic scene in this classic. This particular image is so popular that you can easily get it from a Google search.
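The snippet does not say how the app matches a text query to frames. A common approach with CLIP-style models, assumed here and not confirmed by the source, is to embed the query and each sampled frame into the same vector space and rank frames by cosine similarity. The sketch below shows only that ranking step. The function names (cosine, top_frames) and the tiny 3-dimensional embeddings are hypothetical stand-ins for real model outputs.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_frames(query_vec, frames, k=3):
    """Return the k (timestamp, score) pairs most similar to the query.

    `frames` is a list of (timestamp_seconds, embedding) tuples.
    """
    scored = [(ts, cosine(query_vec, emb)) for ts, emb in frames]
    # Highest similarity first.
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


# Toy embeddings, made up for illustration only (assumption: a real app
# would obtain these from a CLIP image encoder and text encoder).
frames = [
    (0.0, [1.0, 0.0, 0.0]),
    (1.0, [0.0, 1.0, 0.0]),
    (2.0, [0.6, 0.8, 0.0]),
]
query = [0.0, 1.0, 0.0]

best = top_frames(query, frames, k=2)
for timestamp, score in best:
    print(f"t={timestamp:.1f}s similarity={score:.2f}")
```

In this toy data the frame at 1.0 s ranks first and the frame at 2.0 s second. A real app would also need to download the video and sample frames, which this sketch leaves out.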

brad pitt, spiderman, subtitle, (12 more...)

Industry:

Media > Film (1.00)
Leisure & Entertainment (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.50)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.40)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.40)

Pinaki Laskar on LinkedIn: #artificialintelligence #machinelearning #deeplearning

#artificialintelligence, Aug-10-2021, 03:30:31 GMT

AI Researcher, Cognitive Technologist Inventor - AI Thinking, Think Chain Innovator - AIOT, XAI, Autonomous Cars, IIOT Founder Fisheyebox Spatial Computing Savant, Transformative Leader, Industry X.0 Practitioner

At what stage of development are #artificialintelligence and #machinelearning now? We are living in exciting times, as the narrow AI of statistical ML/DL is to be replaced by causal AI/ML/DL. Are there any new breakthrough results? OpenAI shocked the world a year ago with GPT-3. Google presented LaMDA and MUM, two AIs that will revolutionize chatbots and the search engine, respectively.

artificialintelligence, graphic retrieval, pinaki laskar, (9 more...)

Country: Asia > China > Beijing > Beijing (0.08)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.61)

Wu Dao 2.0 - Bigger, Stronger, Faster AI From China

#artificialintelligence, Jul-19-2021, 09:25:35 GMT

It is no secret that China has COVID-19 under control. When you travel there you need to go through a 2-week hotel quarantine but once you are in the country, you are safe. Probably even safer than before COVID as wearing a mask is now part of the etiquette, and the many other viral respiratory diseases are likely to be on the decline. Hence, when I got invited to speak at the annual conference of the Beijing Academy of Artificial Intelligence (BAAI) in the AI for healthcare section, I readily accepted. The BAAI is a great platform for showcasing technology and talent across broad categories.

gpt-3, language model, wu dao 2, (14 more...)

Country:

Asia > China > Beijing > Beijing (0.25)
Asia > China > Hong Kong (0.05)

Genre: Research Report (0.48)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.55)
Health & Medicine > Therapeutic Area > Immunology (0.55)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.61)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.60)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.60)

openai/CLIP

#artificialintelligence, Mar-30-2021, 13:57:43 GMT

CLIP (Contrastive Language-Image Pre-Training) is a neural network trained on a variety of (image, text) pairs. It can be instructed in natural language to predict the most relevant text snippet, given an image, without directly optimizing for the task, similar to the zero-shot capabilities of GPT-2 and GPT-3. We found CLIP matches the performance of the original ResNet50 on ImageNet "zero-shot" without using any of the original 1.28M labeled examples, overcoming several major challenges in computer vision. To get started, first install PyTorch 1.7.1 and torchvision, as well as small additional dependencies, and then install this repo as a Python package. Calling clip.load(name) returns the model and the TorchVision transform needed by the model, where name is one of the model names returned by clip.available_models(). The name argument can also be a path to a local checkpoint.
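To make "predict the most relevant text snippet, given an image" concrete, here is a minimal sketch of the scoring step only, assuming the image and the candidate texts have already been encoded into vectors. It does not call the clip package. The helper names (normalize, softmax, zero_shot_probs), the 2-dimensional toy vectors, and the fixed scale of 100 are illustrative assumptions, not the repository's API.

```python
import math


def normalize(vec):
    """Scale a vector to unit length so dot products become cosines."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def zero_shot_probs(image_vec, text_vecs, scale=100.0):
    """Turn image-text cosine similarities into label probabilities.

    `scale` sharpens the distribution; 100.0 is an assumed constant here.
    """
    image_unit = normalize(image_vec)
    logits = []
    for text_vec in text_vecs:
        text_unit = normalize(text_vec)
        logits.append(scale * sum(a * b for a, b in zip(image_unit, text_unit)))
    return softmax(logits)


# Hypothetical embeddings; a real run would get these from the model's
# image and text encoders.
labels = ["a diagram", "a dog", "a cat"]
text_vecs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
image_vec = [0.1, 0.9]

probs = zero_shot_probs(image_vec, text_vecs)
predicted = labels[probs.index(max(probs))]
print(predicted)
```

Because the labels are ordinary text, the candidate set can be changed without retraining, which is what "zero-shot" refers to in the description above.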

batch, clip model, openai clip, (6 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.40)
